ESM 244 Lab 1:

Objectives:

Packages:

library(tidyverse)
library(sf)
library(leaflet)
library(tmap)

1. Review

Data for Census tract-level poverty levels and environmental hazard from US Department of Housing and Urban Development (http://portal.hud.gov/hudportal/hud).

“The environmental health hazard exposure index summarizes potential exposure to harmful toxins at a neighborhood level. Potential health hazards exposure is a linear combination of standardized EPA estimates of air quality carcinogenic (c), respiratory (r) and neurological (n) hazards with i indexing census tracts.”

“The Low Poverty Index captures the depth and intensity of poverty in a given neighborhood. The index uses both family poverty rates and public assistance receipt, in the form of cash-welfare, such as Temporary Assistance for Needy Families (TANF). The index is a linear combination of two vectors: the family poverty rate (pv) and the percentage of households receiving public assistance (pa).”

2. California counties (spatial data)

Data: CA.gov California Open Data Portal: https://data.ca.gov/dataset/ca-geographic-boundaries

File types (see more at https://www.census.gov/geo/maps-data/data/tiger-line.html):

  • Feature geometry: .shp
  • Index of feature geometry: .shx
  • Character encoding: .cpg
  • Attribute information: .dbf
  • Coodinate system information: .prj
  • International Organization for Standardization (ISO 191) metadata: .shp.iso.xml
  • ISO 191 (entity and attribute) metatdata: .shp.ea.iso.xml

Attribute variables we’ll use:

  1. Use st_read() in the sf package to read in shape file data. You’ll need to include the directory (just “.” if in the project working directory) and the file prefix argument ‘layer = “prefix_here”’.
ca_counties <- st_read(dsn = ".", layer = "CA_Counties_TIGER2016")

A really cool thing about the sf package is that geometries are sticky - that means that we basically get to work with spatial attributes like a normal tibble/data frame, but the geometries (spatial information) stick to it.

  1. Select only the ALAND attribute + County name. Notice that geometries stick. Use plot() to see counties.
ca_land <- ca_counties %>% 
  select(NAME, ALAND)

# plot(ca_land)
  1. Join the spatial data to income data (ca_pop_inc.csv)
# Read pop/income data, then make sure county names column matches
ca_pop_inc <- read_csv("ca_pop_inc.csv") %>% 
  rename(NAME = COUNTY)

# Join the two: 
ca_df <- full_join(ca_land, ca_pop_inc) %>% 
  select(NAME, MedFamilyIncome)
  1. Use geom_sf to create a map using ggplot
# Make a map: 
ca_income <- ggplot(ca_df) +
  geom_sf(aes(fill = MedFamilyIncome), color = "white", size = 0.2) +
  scale_fill_gradientn(colors = c("blue","mediumorchid1","orange")) +
  theme_minimal()

ca_income

  1. …or using leaflet
leaflet(ca_df) %>% 
  addPolygons() 
## Warning: sf layer is not long-lat data
## Warning: sf layer has inconsistent datum (+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 +k=1.0 +units=m +nadgrids=@null +wktext +no_defs).
## Need '+proj=longlat +datum=WGS84'
# Oh no, the projection is wrong! We need it to match the projection that leaflet uses (WGS84)

ca_df_transform <- st_transform(ca_df, crs = 4326)

# Now try that again...

leaflet(ca_df_transform) %>% 
  addTiles() %>% # Adds bg
  addPolygons(weight = 1.0,
              opacity = 1.0,
              color = "white",
              fillOpacity = 0.5,
              fillColor = ~colorQuantile("YlOrRd", MedFamilyIncome)(MedFamilyIncome)
              ) # Adds polygons
  1. …or using tmap!
tmap_mode("view")
## tmap mode set to interactive viewing
## tmap mode set to interactive viewing
# if (Sys.getenv("USER") != "CRAN")

tm_shape(ca_df_transform) + tm_fill("MedFamilyIncome", alpha = 0.5)